Conversation

@Adamtaranto
Owner

@Adamtaranto Adamtaranto commented May 14, 2025

Experimenting with multithreading of alignment processing.

Note: did not end up using Dask.

@Adamtaranto
Owner Author

Adamtaranto commented May 14, 2025

Needs more work. Runtime is currently not significantly different from single-threaded mode.

Columns processed per second also seem to drop toward the end of the run; the cause has not been identified yet.
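For reference, a minimal sketch of the chunked, thread-based column processing being tried here. It is not the code from this PR: the toy alignment, `column_counts`, and `process_in_chunks` are hypothetical names, and per-column character counting stands in for the real per-column work. One possible (unconfirmed) reason for the flat runtime is that in standard CPython builds the global interpreter lock lets only one thread run Python bytecode at a time, so pure-Python CPU-bound loops like this one rarely speed up with threads.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Toy alignment: equal-length rows (hypothetical data, not from this PR).
ROWS = ["ACGT-A", "ACGTTA", "ATGT-A", "ACGTCA"]


def column_counts(rows, start, stop):
    """Count the characters found in each column from start up to stop."""
    return [Counter(row[i] for row in rows) for i in range(start, stop)]


def process_in_chunks(rows, n_workers=2, chunk_size=2):
    """Split columns into contiguous chunks and process each chunk in a thread."""
    n_cols = len(rows[0])
    bounds = [(s, min(s + chunk_size, n_cols)) for s in range(0, n_cols, chunk_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in submission order, so chunks concatenate directly.
        chunks = pool.map(lambda b: column_counts(rows, *b), bounds)
        return [counts for chunk in chunks for counts in chunk]


def process_serial(rows):
    """Single-threaded baseline over all columns."""
    return column_counts(rows, 0, len(rows[0]))
```

Timing `process_in_chunks` against `process_serial` on a realistic alignment would show whether the threads are buying anything before more effort goes into the data structure.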

@Adamtaranto
Owner Author

I think the Biopython MSA data structure is not suitable for multithreaded processing. Next step: try converting the alignment to a Polars DataFrame before processing, then back to an MSA afterwards.

@Adamtaranto
Owner Author

Tested with a NumPy array as the data structure. Speed is limited by complex dict lookups, so the RIP calculation step needs restructuring before multithreading will be useful.
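A rough sketch of the kind of restructuring meant here, under stated assumptions. The actual RIP calculation is not shown in this thread, so per-column symbol counting is used as a stand-in, and `to_array`, `to_rows`, `counts_by_dict`, and `counts_by_array` are hypothetical helper names. It contrasts one dict lookup per cell with whole-array comparisons on a 2D `uint8` array.

```python
import numpy as np

# Toy alignment rows (hypothetical data); all rows must be the same length.
ROWS = ["ACGT-A", "ACGTTA", "ATGT-A", "ACGTCA"]
BASES = "ACGT-"


def to_array(rows):
    """Encode the alignment as a 2D uint8 array of ASCII codes (rows x cols)."""
    return np.array([np.frombuffer(r.encode("ascii"), dtype=np.uint8) for r in rows])


def to_rows(arr):
    """Decode the 2D array back into a list of strings."""
    return [row.tobytes().decode("ascii") for row in arr]


def counts_by_dict(rows):
    """Baseline: one dict lookup and update per alignment cell."""
    out = []
    for i in range(len(rows[0])):
        tally = dict.fromkeys(BASES, 0)
        for row in rows:
            tally[row[i]] += 1
        out.append(tally)
    return out


def counts_by_array(arr):
    """Vectorised: one comparison per symbol, summed down each column."""
    return {b: (arr == ord(b)).sum(axis=0) for b in BASES}
```

If the real per-column logic can be expressed as array comparisons like this, most of the work moves out of the Python interpreter, which may matter more than adding threads.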

@Adamtaranto Adamtaranto closed this Jun 7, 2025
